Replication of changed information in a multi-master environment

ABSTRACT

Changed information is provided to multiple masters of a multi-master environment. In order to facilitate the providing of the changed information to the various masters, at least one replication data structure is used. This data structure is managed in such a way that conflicts are avoided in updating the data structure, and thus, in communicating the changed information to the masters.

TECHNICAL FIELD

[0001] This invention relates, in general, to replicating changed information between servers of a computing environment, and in particular, to replicating changed information between multiple master servers of a multi-master server environment.

BACKGROUND OF THE INVENTION

[0002] Replication is a mechanism in which information held on one server, e.g., a directory server, is copied or replicated to one or more other servers, all of which provide access to the same set of information. Replication can be performed in various ways and based on different replication models. As examples, one replication model is a master/slave model, while another is a multi-master model, each of which is described below.

[0003] In the master/slave replication model, a single server is the updateable master server, while the other servers are read-only slave servers. Although this model can assist in load-balancing heavily read-biased workloads, it is deficient in those environments that require highly available, workload-balanced read/write access to information in a directory. To address these needs, multi-master replication is employed.

[0004] With multi-master replication, multiple servers allow simultaneous write access to the same set of information by allowing each server to update its own copy of the set of information. Protocols are then used to transmit changes made at the different master servers to other master servers so that each server can update its copy of the information. Conflict resolution techniques are also used to work out differences resulting from simultaneous and/or conflicting updates made at multiple different master servers in the multi-master environment.

[0005] Although replication protocols have been established to manage the provision of changes made at multiple servers, enhancements are still needed. For example, a need still exists for a replication capability that provides the information in a simpler and more timely manner. As another example, a need still exists for a replication capability that avoids the need for conflict resolution at the time of providing the changed information.

SUMMARY OF THE INVENTION

[0006] The shortcomings of the prior art are overcome and additional advantages are provided through the provision of a method of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of the replicated set of information. The method includes, for instance, writing by a master of the multi-master environment a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and providing the data structure to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.

[0007] Systems and computer program products corresponding to the above-summarized methods are also described and claimed herein.

[0008] Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed invention.

BRIEF DESCRIPTION OF THE DRAWINGS

[0009] The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:

[0010] FIG. 1a depicts one embodiment of a computing environment incorporating and using one or more aspects of the present invention;

[0011] FIG. 1b depicts another embodiment of a computing environment incorporating and using one or more aspects of the present invention;

[0012] FIGS. 2a and 2c depict an embodiment of replication data structures, including a status data structure and a change data structure, used in accordance with an aspect of the present invention;

[0013] FIG. 2b depicts further details of the status data structure of FIG. 2a, in accordance with an aspect of the present invention;

[0014] FIG. 3 depicts one embodiment of the logic associated with updating the status data structure, in response to a replica being added to the environment, in accordance with an aspect of the present invention;

[0015] FIG. 4 depicts one embodiment of the logic associated with updating the status data structure, in response to a replica being deleted from the environment, in accordance with an aspect of the present invention;

[0016] FIG. 5 depicts one embodiment of the logic associated with updating the replication data structures to reflect the adding or modifying of information in a set of information, in accordance with an aspect of the present invention;

[0017] FIG. 6 depicts one embodiment of the logic associated with updating the replication data structures to reflect the deleting of information from the set of information, in accordance with an aspect of the present invention;

[0018] FIG. 7 depicts one embodiment of the logic associated with purging entries from the change data structure, in accordance with an aspect of the present invention; and

[0019] FIG. 8 depicts one embodiment of the logic associated with replication processing, in accordance with an aspect of the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

[0020] In accordance with an aspect of the present invention, a replication capability is provided for a multi-master environment in which changed information is replicated among multiple masters of the environment, in such a manner that conflict resolution is substantially avoided in communicating the changes. In a further aspect, change information is communicated, even if a particular server is not available at the time the change is made. This is accomplished by, for example, using one or more data structures to provide the changed information to the various masters.

[0021] One embodiment of a computing environment incorporating and using one or more aspects of the present invention is described with reference to FIG. 1a. In one example, a multi-master environment 100 includes a plurality (e.g., four) of masters 102 coupled to one another via one or more replication data structures 104. In one example, each master is a server, such as an LDAP directory server, residing on at least one of various computing units, including, for instance, RS/6000s, pSeries or zSeries mainframes, offered by International Business Machines Corporation, Armonk, N.Y. Each master, also referred to as a replica, includes one or more sets of information 106, including, for instance, one or more directories.

[0022] A directory is a database, which is structured as a hierarchical tree of entries. An entry includes information in the form of attribute/value combinations. An attribute has a name, referred to as the attribute type, and one or more values. An attribute containing more than one value is referred to as multi-valued. Operations can be performed against the directory tree. These include adding an entry to the tree; modifying an entry; deleting an entry; searching for entries matching a filter, starting at some root entry; reading an entry; and comparing an attribute value to a test/assertion value. In one example, the directories are managed by one or more masters, which are also known as directory servers or replicas. A master provides access to one or more trees of entries, as well as some form of replication with other directory servers.
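By way of illustration only, the directory model just described might be sketched in Python as follows. The class name Entry, its method names, and the sample entries are hypothetical and are not part of any particular directory server; the sketch merely shows entries holding attribute/value combinations, arranged as a tree, with a few of the listed operations.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One directory entry: a name, attribute/value combinations, child entries."""
    name: str
    # attribute type -> values; more than one value means multi-valued
    attributes: dict[str, list[str]] = field(default_factory=dict)
    children: dict[str, "Entry"] = field(default_factory=dict)

    def add(self, child: "Entry") -> None:
        """Add an entry beneath this one; names are unique among siblings."""
        if child.name in self.children:
            raise KeyError(f"entry {child.name!r} already exists")
        self.children[child.name] = child

    def modify(self, attr_type: str, values: list[str]) -> None:
        """Replace the values held for one attribute type."""
        self.attributes[attr_type] = list(values)

    def delete(self, name: str) -> None:
        """Delete a child entry by name."""
        del self.children[name]

    def compare(self, attr_type: str, assertion: str) -> bool:
        """Compare an attribute value to a test/assertion value."""
        return assertion in self.attributes.get(attr_type, [])

    def search(self, matches):
        """Yield this entry and every descendant that satisfies the filter."""
        if matches(self):
            yield self
        for child in self.children.values():
            yield from child.search(matches)


# Hypothetical sample data.
root = Entry("o=example")
people = Entry("ou=people")
root.add(people)
people.add(Entry("cn=alice", {"mail": ["alice@example.com", "a@example.com"]}))
people.add(Entry("cn=bob", {"mail": ["bob@example.com"]}))

# Search from the root for entries whose "mail" attribute is multi-valued.
multi_valued = [e.name for e in root.search(lambda e: len(e.attributes.get("mail", [])) > 1)]
```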

[0023] In one example, replication includes transmitting changes made to a directory between multiple masters holding copies of the directory. In accordance with an aspect of the present invention, the transmission includes employing at least one replication data structure 104. As one example, there is a set of replication data structures (e.g., a change table and a status table) for each replicated set of information (e.g., each replicated directory). That is, if there are two directories replicated on four masters, then there is a change table and a status table for each of the two directories. The tables (e.g., the change tables) serve as the communications medium by which the masters or replicas are informed of changes made to the directory entries. Further details regarding the replication data structures are described below.

[0024] One embodiment of a status data structure is described with reference to FIG. 2a. As one example, the status data structure includes a status table 200 having one or more rows 202. A row is allocated for each server that is participating in the replication environment. Thus, for the particular example of FIG. 1a, status table 200 includes four rows, one for each of Masters 1-4. Each row includes, for example, a ReplicaId 204, which is a primary key for the table and uniquely identifies the master that owns that row; and a LastChangeVector 206, which represents the last changes processed by this replica for changes that originated from the replicas participating in the replication.

[0025] In one example, the vector includes one or more pairs of values 210 (FIG. 2b), each pair identifying the replica 212 that originated the change on that replica's copy of the directory; and the last change processed 214 for the change-originating replica by the replica 216 owning the row. For instance, for Replica 1, the last change it processed for changes originated by itself is change 2; the last change it processed for changes originated by Replica 2 is change 1; the last change it processed for changes originated by Replica 3 is change 4; etc. The LastChangeVector is used, for example, by a replica to determine the next set of changes to be processed; and as a further example, by the replicas to determine which changes have been processed by all the participating replicas so that clean-up of those changes can take place.
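As one non-limiting sketch, the status table and its LastChangeVector might be modeled in Python as nested mappings. The three-replica layout, the helper names last_processed and processed_by_all, and the values in the rows for Replicas 2 and 3 are assumptions made for the example; only the values in Replica 1's row are taken from the description above.

```python
# Status table: ReplicaId (primary key) -> LastChangeVector.
# Each vector maps an originating ReplicaId to the last ChangeId from that
# originator which the owner of the row has processed.
status_table: dict[int, dict[int, int]] = {
    1: {1: 2, 2: 1, 3: 4},  # Replica 1's row, as in the example above
    2: {1: 2, 2: 1, 3: 3},  # hypothetical values
    3: {1: 1, 2: 1, 3: 4},  # hypothetical values
}


def last_processed(table: dict[int, dict[int, int]], owner: int, originator: int) -> int:
    """Last change from `originator` that `owner` has processed (0 = none yet)."""
    return table[owner].get(originator, 0)


def processed_by_all(table: dict[int, dict[int, int]], originator: int) -> int:
    """Highest ChangeId from `originator` that every replica has processed."""
    return min(vector.get(originator, 0) for vector in table.values())
```

With these values, last_processed(status_table, 1, 3) yields 4, and processed_by_all(status_table, 3) yields 3, because Replica 2 has so far processed only change 3 of the changes originated by Replica 3.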

[0026] One embodiment of a change data structure is described with reference to FIG. 2c. In one example, the change data structure includes a change table 220 having one or more rows 222. There is a change table for each replicated set of information (e.g., each directory), and a row is created in the change table for each change that is made to the set of information corresponding to the change table, regardless of the master making the change. Thus, rows are added to the change table, in response to changes made to the directory by any master.

[0027] Rows are deleted from the change table, in response to any master noticing that the changes of the rows have been seen by all the masters participating in the replication environment. Rows in the change table are not updated. This enables multiple masters to simultaneously write to the change table without conflicts.

[0028] In one example, row 222 of the change table includes, for instance, a ChangeId 224, which is a primary key for the table, and is a value that uniquely identifies the change; a ReplicaId 226 that indicates the master or replica that performed the change; an EntryId 228 that indicates the entry in the directory upon which the change was made; and ChangeInformation 230 that includes the detailed set of changes that were applied to the entry by the replica that made the change. The ReplicaId aids in processing/searching the change table as changes sometimes are searched by the replica in which the change was made. The changes in the change table are provided to the various replicas of the environment, so that the replicas can apply the changes to their copies of the directory.
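For illustration, a change table whose rows are added but never updated might be sketched in Python as shown below. The names ChangeRow and ChangeTable, the use of integers for ChangeId and ReplicaId, and the dictionary form of ChangeInformation are assumptions of this sketch rather than requirements of the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: a row cannot be updated once it is written
class ChangeRow:
    change_id: int     # ChangeId 224: primary key, unique per change
    replica_id: int    # ReplicaId 226: replica that performed the change
    entry_id: str      # EntryId 228: directory entry that was changed
    change_info: dict  # ChangeInformation 230: details of the change


class ChangeTable:
    """Table keyed by ChangeId; rows are only ever added (or later deleted)."""

    def __init__(self) -> None:
        self._rows: dict[int, ChangeRow] = {}

    def add(self, row: ChangeRow) -> None:
        if row.change_id in self._rows:
            raise KeyError(f"ChangeId {row.change_id} already exists")
        self._rows[row.change_id] = row

    def by_replica(self, replica_id: int) -> list[ChangeRow]:
        """Rows originated by one replica, in ChangeId order."""
        rows = [r for r in self._rows.values() if r.replica_id == replica_id]
        return sorted(rows, key=lambda r: r.change_id)


# Hypothetical sample rows written by two different replicas.
table = ChangeTable()
table.add(ChangeRow(2, 2, "cn=bob", {"op": "modify", "mail": ["bob@example.com"]}))
table.add(ChangeRow(1, 1, "cn=alice", {"op": "add"}))
table.add(ChangeRow(3, 1, "cn=alice", {"op": "delete"}))
```

Searching by ReplicaId, as in by_replica, reflects the observation above that changes are sometimes searched by the replica in which the change was made.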

[0029] The replication data structures may be centrally located and remotely accessible to the masters, as depicted in FIG. 1a, or they, themselves, may be replicated and located at each master, as depicted in FIG. 1b. A combination of the above or other variations may also exist. In one embodiment, when the data structures are remotely located, a distributed two-phase commit is used to enable the data structure updates to be performed within the same transactional scope as the updates made to the directory server on which the change originated.

[0030] To manage the replication data structures, operations are applied thereto. As an example, the status table is modified (i.e., rows are added or deleted), in response to replicas being added to or deleted from the environment. Further details regarding these operations are described below.

[0031] For example, one embodiment of the logic associated with modifying the status table to reflect an add of a replica to the environment is described with reference to FIG. 3. In response to being added to the environment, the replica adds a row to the status table, STEP 300. Thereafter, a unique ReplicaId is assigned to the row identifying the added replica, STEP 302. Further, LastChangeVector is initialized to values indicating the oldest possible change id (e.g., 0) for each replica participating in replication, STEP 304.
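The logic of FIG. 3 might be sketched in Python as follows. The sketch assumes that the status table is a mapping from ReplicaId to LastChangeVector and that the caller supplies the ReplicaId; how the unique ReplicaId is actually chosen is not specified above, so the function add_replica only checks that the supplied value is not already in use.

```python
def add_replica(status_table: dict[int, dict[int, int]], replica_id: int) -> None:
    """FIG. 3 sketch: add a status row for a replica joining the environment."""
    if replica_id in status_table:
        # STEP 302: the ReplicaId identifying the new row must be unique.
        raise KeyError(f"ReplicaId {replica_id} is already in use")
    participants = sorted(set(status_table) | {replica_id})
    # STEPS 300 and 304: the new row starts at the oldest possible ChangeId (0)
    # for every replica participating in replication.
    status_table[replica_id] = {origin: 0 for origin in participants}


# Hypothetical starting state with two replicas.
status_table = {1: {1: 2, 2: 1}, 2: {1: 2, 2: 1}}
add_replica(status_table, 3)
```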

[0032] As a further example, the status table is updated when a replica is deleted from the computing environment. One embodiment of the logic associated with modifying the status table to reflect the deletion is described with reference to FIG. 4. In response to being deleted, the replica deletes its row from the status table, STEP 400. However, the LastChangeVector of the remaining replicas is not reduced in size at this time. This is to ensure that the remaining replicas can still process updates that are found in the change table which originated from the removed replica.
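A corresponding sketch of FIG. 4, under the same assumed mapping from ReplicaId to LastChangeVector, is given below. The function name remove_replica and the sample values are hypothetical.

```python
def remove_replica(status_table: dict[int, dict[int, int]], replica_id: int) -> None:
    """FIG. 4 sketch: a departing replica deletes only its own status row."""
    del status_table[replica_id]  # STEP 400
    # The remaining rows are deliberately left untouched: their vectors keep the
    # entry for the departed replica, so changes it originated that are still in
    # the change table can be processed.


# Hypothetical starting state with three replicas.
status_table = {
    1: {1: 2, 2: 1, 3: 4},
    2: {1: 2, 2: 1, 3: 3},
    3: {1: 1, 2: 1, 3: 4},
}
remove_replica(status_table, 3)
```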

[0033] In addition to performing operations on the status table, operations are also performed on the change table. In one example, the change table is modified (i.e., a row is added), in response to adding an entry to the directory, modifying an existing directory entry or directory entry's name, or deleting a directory entry.

[0034] One embodiment of the logic associated with modifying the change table, as well as the status table, based on an add/modification to a directory entry is described with reference to FIG. 5. In response to adding a directory entry, the replica in which the add was performed adds a row to the change table in order to keep track of the entry being added, STEP 500. For example, a unique ChangeId and a ReplicaId of the replica adding the directory entry are specified. Additionally, an EntryId is provided, as well as ChangeInformation describing the addition.

[0035] Moreover, in one embodiment, the status table is updated to reflect the addition, STEP 502. For example, the replica's LastChangeVector value corresponding to itself is updated with the ChangeId of the row added to the change table. As an example, assume Replica 1 added a change identified as ChangeId 7, then the LastChangeVector in the row owned by Replica 1 is updated by changing the ChangeId corresponding to Replica 1 to 7 (e.g., 1,7).

[0036] Similar logic is performed when a directory entry is modified. For example, with reference to FIG. 5, to keep track of a directory entry being modified, the replica in which the modification was performed adds a row to the change table, which indicates the entry that was modified, STEP 500. Additionally, the status table is updated by changing the replica's LastChangeVector value corresponding to itself to the ChangeId of the row added to the change table, STEP 502.

[0037] In addition to adding or modifying an entry, an existing entry's name may also be modified. Again, one embodiment of the logic associated with modifying the replication data structures to reflect a modification to an entry's name is described with reference to FIG. 5. To keep track of an entry's name being modified, the replica in which the name modification was performed adds a row to the change table indicating the entry's name that was modified, STEP 500. The status table is updated, as described previously, STEP 502. For instance, the replica's LastChangeVector value corresponding to itself is updated with the ChangeId of the row added to the change table.

[0038] A further operation that can be performed on a directory is deleting an entry from a directory. Thus, one embodiment of the logic associated with updating the replication data structures in view of a delete is described with reference to FIG. 6. To keep track of an entry being deleted, the replica in which the delete was performed adds a row to the change table which indicates that the entry was deleted, STEP 600. Additionally, the status table is updated, STEP 602. For example, the replica's LastChangeVector value corresponding to itself is updated with the ChangeId of the row added to the change table.

[0039] In the above processing, it is shown that in order to reflect a change made to the directory, a unique row is added to the change table. No row in the change table is updated; rather, a new row is added. This enables conflict resolution to be avoided when updating the replication data structures to reflect the adding, modifying or deleting of entries in the directory.
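The add, modify, rename and delete cases of FIGS. 5 and 6 share one pattern, which might be sketched in Python as follows. The in-memory tables, the function record_change, and the counter used to hand out ChangeIds are assumptions of the sketch; in particular, the manner in which unique ChangeIds are generated is not specified above. The starting values are chosen so that the first logged change reproduces the ChangeId 7 example.

```python
import itertools

change_table: dict[int, dict] = {}  # ChangeId -> row
status_table = {1: {1: 6, 2: 5}, 2: {1: 6, 2: 5}}  # hypothetical starting state
_next_change_id = itertools.count(7)  # hypothetical ChangeId allocator


def record_change(replica_id: int, entry_id: str, operation: str, details: dict) -> int:
    """FIG. 5 / FIG. 6 sketch: log one directory change made at `replica_id`."""
    change_id = next(_next_change_id)
    # STEP 500 / STEP 600: always insert a new row; no existing row is updated.
    change_table[change_id] = {
        "ReplicaId": replica_id,
        "EntryId": entry_id,
        "ChangeInformation": {"operation": operation, **details},
    }
    # STEP 502 / STEP 602: the replica advances its own entry in its own row.
    status_table[replica_id][replica_id] = change_id
    return change_id


first = record_change(1, "cn=alice", "add", {"mail": ["alice@example.com"]})
second = record_change(2, "cn=alice", "delete", {})
```

After the first call, the LastChangeVector in the row owned by Replica 1 holds the pair (1,7); the entry for Replica 2 in that row is unchanged until Replica 1 later processes Replica 2's change.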

[0040] After each replica participating in replication has processed a change, that row can be removed from the change table. There is no requirement that the rows be removed immediately after the replicas have processed them. However, it is desirable that they be removed at some point after all the replicas have processed the change.

[0041] There are various strategies for removing the old changes from the change table. The process of removing this information is called purge processing. Regardless of the strategy used (e.g., check on every update, check only after every n updates, separate background task), the tables are constructed such that any replica can perform purge processing, even multiple replicas simultaneously. The worst case scenario is that multiple replicas try to delete the same row from the change table. Duplicate deletion is an easy conflict to resolve, however. The additional deletion is simply ignored; that is, from the replica's point of view, if the deletion fails because the row does not exist, it is considered normal processing and no error is reported.

[0042] One embodiment of the logic associated with purge processing is described with reference to FIG. 7. This processing is performed by any one of the replicas. Initially, a determination is made as to whether there is a replica to be processed, INQUIRY 700. That is, is there a remaining ReplicaId that has not been considered during the purge processing?

[0043] If there is a replica to participate in the processing, then for the selected replica, a determination is made as to the oldest ChangeId of the selected replica that has been processed by all the replicas, STEP 702. This is accomplished by examining the LastChangeVector for the replicas and determining the oldest ChangeId for the ReplicaId. Thereafter, all changes for that ReplicaId, where the ChangeId is older than or equal to the oldest ChangeId determined in STEP 702, are deleted from the change table, STEP 704.

[0044] Subsequently, a determination is made as to whether there are remaining changes for a ReplicaId that no longer exists, INQUIRY 706. In one example, this determination is made by searching the change table by ReplicaId. If there are no remaining changes for a ReplicaId which no longer exists in the set of replicas, then the status table may be updated, STEP 708. For example, a replica can remove the ChangeId value for that replica from its own LastChangeVector. In this example, replicas only update their own status table row, so a replica during purge processing can only update its own status table row. (In other embodiments, a replica can remove a replica from its own LastChangeVector during add, modify and/or delete processing.)

[0045] If, on the other hand, there are remaining changes for the ReplicaId, then processing continues with INQUIRY 700. When all of the participating replicas have been processed, the purge processing is complete, STEP 710.
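One possible rendering of the purge processing of FIG. 7 is sketched below in Python. The dictionary-based tables, the function name purge, the parameter me (the ReplicaId of the replica running the purge), and the sample data are assumptions of the sketch. A missing vector entry is treated as ChangeId 0, i.e., as nothing processed, which errs on the side of keeping rows.

```python
def purge(change_table: dict[int, dict], status_table: dict[int, dict[int, int]], me: int) -> None:
    """FIG. 7 sketch, run by replica `me`; several replicas may run it at once."""
    originators = {row["ReplicaId"] for row in change_table.values()}
    for vector in status_table.values():
        originators.update(vector)

    for origin in sorted(originators):  # INQUIRY 700
        # STEP 702: highest ChangeId from `origin` processed by every replica.
        seen_by_all = min(v.get(origin, 0) for v in status_table.values())
        # STEP 704: delete rows from `origin` at or below that ChangeId.
        for change_id, row in list(change_table.items()):
            if row["ReplicaId"] == origin and change_id <= seen_by_all:
                change_table.pop(change_id, None)  # a duplicate delete is ignored
        # INQUIRY 706 / STEP 708: once no changes remain for a replica that has
        # left, drop it from our own vector. Only our own row is ever updated.
        departed = origin not in status_table
        pending = any(r["ReplicaId"] == origin for r in change_table.values())
        if departed and not pending:
            status_table[me].pop(origin, None)


# Hypothetical state: Replica 3 has left, and its last change (4) was seen by all.
change_table = {
    1: {"ReplicaId": 1}, 2: {"ReplicaId": 1},
    3: {"ReplicaId": 2}, 4: {"ReplicaId": 3},
}
status_table = {1: {1: 2, 2: 3, 3: 4}, 2: {1: 1, 2: 3, 3: 4}}
purge(change_table, status_table, me=1)
```

In this example only change 2 survives, since Replica 2 has not yet processed it, and Replica 1 removes the entry for the departed Replica 3 from its own LastChangeVector while leaving the row owned by Replica 2 untouched.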

[0046] The information held in the change table is used to inform masters participating in the replication environment of changes made to the directory by other masters. One embodiment of the logic associated with providing this changed information to one or more masters is described with reference to FIG. 8. As one example, each replica participating in replication periodically accesses the change table to learn of any changes, STEP 800. There are a number of different ways to process the change table, but in one example, it includes searching the change table based on ChangeId and ReplicaId to process, in ChangeId order, the updates that are newer than the ChangeId values held in the LastChangeVector for the replica. Since there is a ChangeId in the LastChangeVector for each ReplicaId, the processing may start at different points in the change table based on the ReplicaId in which the change was made.

[0047] In response to learning of changes, the replica applies the changes logged in the change table to its own copy of the directory, STEP 802. The manner in which these changes are applied is database dependent. Each database may be of its own type, have its own structure and/or format, and the master is able to understand the nature of the proposed updates, even though they originated from another server. Known techniques are used to reconcile any changes to be applied, if there are conflicts at the application time. Each master updates its own database in the structure and format of its database.

[0048] Thereafter, the replica updates its own row in the status table, STEP 804. In one example, this is accomplished by changing the LastChangeVector of the replica's row based on the changes it processed from the change table. For example, if ReplicaId 1 just applied change 7 of changes originating from ReplicaId 2, then the ChangeId in the LastChangeVector corresponding to ReplicaId 2 in the row owned by ReplicaId 1 is updated to 7, etc. This completes replication processing.
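The replication processing of FIG. 8 might be sketched in Python as follows. The function replicate, the dictionary-based tables, and the apply_change callback, which stands in for the database-dependent application of a change, are hypothetical. The sketch skips rows that the replica itself originated, on the assumption that those changes were already applied locally when they were logged.

```python
def replicate(change_table: dict[int, dict], status_table: dict[int, dict[int, int]],
              me: int, apply_change) -> list[int]:
    """FIG. 8 sketch: replica `me` applies the changes it has not yet processed."""
    my_vector = status_table[me]
    applied = []
    # STEP 800: scan in ChangeId order; the effective starting point differs
    # per originating replica, according to the LastChangeVector.
    for change_id in sorted(change_table):
        row = change_table[change_id]
        origin = row["ReplicaId"]
        if origin == me or change_id <= my_vector.get(origin, 0):
            continue  # our own change, or one already processed
        apply_change(row)  # STEP 802: database-dependent application
        my_vector[origin] = change_id  # STEP 804: update only our own row
        applied.append(change_id)
    return applied


# Hypothetical state: Replica 1 has processed change 5 from Replica 2 and
# nothing yet from Replica 3.
change_table = {
    5: {"ReplicaId": 2, "EntryId": "cn=bob"},
    6: {"ReplicaId": 3, "EntryId": "cn=carol"},
    7: {"ReplicaId": 2, "EntryId": "cn=bob"},
    8: {"ReplicaId": 1, "EntryId": "cn=alice"},
}
status_table = {1: {1: 8, 2: 5, 3: 0}}
local_log: list[dict] = []
applied = replicate(change_table, status_table, 1, local_log.append)
```

Here Replica 1 applies changes 6 and 7, and the entry for ReplicaId 2 in its LastChangeVector is updated to 7, as in the example above. Running the function again applies nothing further.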

[0049] Described in detail above is a capability for facilitating the providing of changed information to multiple masters of a multi-master environment, while substantially avoiding conflict resolution during the providing of the changed information. The capability is relatively simple to use and provides the information in a timely manner. In one example, the capability employs one or more data structures to communicate the changed information.

[0050] In accordance with an aspect of the present invention, non-conflicting updates are made to the data structures for various operations applied thereto. Row additions are uniquely made, either by ReplicaId or ChangeId. Further, row updates of the status table are isolated to a single updater, since each status table row is owned by a particular replica.

[0051] Although in the above embodiments, the replication data structures are tables, this is only one example. Other types of data structures may also be used. Further, in another embodiment, techniques other than employing status data structures may be used in cleaning up the change data structures. The use of status data structures is only one example.

[0052] Additionally, in other embodiments, the LastChangeVector does not include values for the replica owning that row. Thus, the status table is not updated in those situations in which the owning replica is adding a row to the change table, as an example.

[0053] Moreover, although in the above embodiments, there is a change table for each replicated set of information, in other embodiments the change table may accommodate multiple replicated sets of information.

[0054] Further, even though the masters are servers in the examples above, other types of masters can participate in one or more aspects of the present invention.

[0055] In another aspect of the present invention, information in the change table is protected from viewing except from other masters, while the information is in the table and in transit between the masters. This is facilitated by using, for instance, a digital envelope technique. The digital envelope includes, for instance, encrypted bulk-encryption keys for each recipient server, which are encrypted using the public key of each recipient server. The bulk-encryption keys can be decrypted by each recipient using its private key. This is a form of multi-cast of enveloped information.

[0056] The present invention can be included in an article of manufacture (e.g., one or more computer program products) having, for instance, computer usable media. The media has embodied therein, for instance, computer readable program code means for providing and facilitating the capabilities of the present invention. The article of manufacture can be included as a part of a computer system or sold separately.

[0057] Additionally, at least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform the capabilities of the present invention can be provided.

[0058] The flow diagrams depicted herein are just examples. There may be many variations to these diagrams or the steps (or operations) described therein without departing from the spirit of the invention. For instance, the steps may be performed in a differing order, or steps may be added, deleted or modified. All of these variations are considered a part of the claimed invention.

[0059] Although preferred embodiments have been depicted and described in detail herein, it will be apparent to those skilled in the relevant art that various modifications, additions, substitutions and the like can be made without departing from the spirit of the invention and these are therefore considered to be within the scope of the invention as defined in the following claims. 

What is claimed is:
 1. A method of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of a replicated set of information, said method comprising: writing by a master of the multi-master environment a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and providing the data structure to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.
 2. The method of claim 1, wherein the replicated set of information comprises a replicated database.
 3. The method of claim 1, further comprising updating by the master a status data structure to reflect the processing by the master of the one or more changes of the change information entry, said status data structure being accessible to one or more masters of the multi-master environment.
 4. The method of claim 3, further comprising cleaning up, by one or more masters of the multi-master environment, the data structure based on information provided by the status data structure.
 5. The method of claim 1, wherein the providing comprises: replicating the data structure; and forwarding the replicated data structure to the another master.
 6. The method of claim 1, wherein the providing comprises centrally locating the data structure to be accessible by the another master.
 7. The method of claim 1, wherein the master's copy of the replicated set of information has at least one of a different format and a different structure from the another master's copy of the replicated set of information.
 8. The method of claim 1, further comprising deleting by one or more masters of the multi-master environment one or more change information entries from the data structure, in response to the one or more change information entries no longer being needed.
 9. The method of claim 8, further comprising employing a status data structure to determine when the one or more change information entries are no longer needed.
 10. The method of claim 1, wherein the writing comprises writing by a plurality of masters of the multi-master environment a plurality of unique change information entries to the data structure, wherein change information entries of the data structure are not modifiable, and wherein the providing comprises providing the data structure to one or more masters of the multi-master environment to enable the one or more masters to update their copies of the replicated set of information.
 11. A system of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of a replicated set of information, said system comprising: means for writing by a master of the multi-master environment a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and means for providing the data structure to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.
 12. The system of claim 11, further comprising means for updating by the master a status data structure to reflect the processing by the master of the one or more changes of the change information entry, said status data structure being accessible to one or more masters of the multi-master environment.
 13. The system of claim 12, further comprising means for cleaning up, by one or more masters of the multi-master environment, the data structure based on information provided by the status data structure.
 14. The system of claim 11, wherein the means for providing comprises: means for replicating the data structure; and means for forwarding the replicated data structure to the another master.
 15. The system of claim 11, wherein the means for providing comprises means for centrally locating the data structure to be accessible by the another master.
 16. The system of claim 11, wherein the master's copy of the replicated set of information has at least one of a different format and a different structure from the another master's copy of the replicated set of information.
 17. The system of claim 11, further comprising means for deleting by one or more masters of the multi-master environment one or more change information entries from the data structure, in response to the one or more change information entries no longer being needed.
 18. The system of claim 17, further comprising means for employing a status data structure to determine when the one or more change information entries are no longer needed.
 19. The system of claim 11, wherein the means for writing comprises means for writing by a plurality of masters of the multi-master environment a plurality of unique change information entries to the data structure, wherein change information entries of the data structure are not modifiable, and wherein the means for providing comprises means for providing the data structure to one or more masters of the multi-master environment to enable the one or more masters to update their copies of the replicated set of information.
 20. A system of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of a replicated set of information, said system comprising: a master of the multi-master environment to write a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and wherein the data structure is provided to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.
 21. At least one program storage device readable by a machine, tangibly embodying at least one program of instructions executable by the machine to perform a method of facilitating the providing of change information to masters of a multi-master environment, wherein at least two masters of the multi-master environment have a copy of a replicated set of information, said method comprising: writing by a master of the multi-master environment a change information entry to a data structure modifiable by a plurality of masters of the multi-master environment, wherein the change information entry corresponds to one or more changes of the master's copy of the replicated set of information; and providing the data structure to another master of the multi-master environment to enable the another master to update its copy of the replicated set of information.
 22. The at least one program storage device of claim 21, wherein the replicated set of information comprises a replicated database.
 23. The at least one program storage device of claim 21, wherein said method further comprises updating by the master a status data structure to reflect the processing by the master of the one or more changes of the change information entry, said status data structure being accessible to one or more masters of the multi-master environment.
 24. The at least one program storage device of claim 23, wherein said method further comprises cleaning up, by one or more masters of the multi-master environment, the data structure based on information provided by the status data structure.
 25. The at least one program storage device of claim 21, wherein the providing comprises: replicating the data structure; and forwarding the replicated data structure to the another master.
 26. The at least one program storage device of claim 21, wherein the providing comprises centrally locating the data structure to be accessible by the another master.
 27. The at least one program storage device of claim 21, wherein the master's copy of the replicated set of information has at least one of a different format and a different structure from the another master's copy of the replicated set of information.
 28. The at least one program storage device of claim 21, wherein said method further comprises deleting by one or more masters of the multi-master environment one or more change information entries from the data structure, in response to the one or more change information entries no longer being needed.
 29. The at least one program storage device of claim 28, wherein said method further comprises employing a status data structure to determine when the one or more change information entries are no longer needed.
 30. The at least one program storage device of claim 21, wherein the writing comprises writing by a plurality of masters of the multi-master environment a plurality of unique change information entries to the data structure, wherein change information entries of the data structure are not modifiable, and wherein the providing comprises providing the data structure to one or more masters of the multi-master environment to enable the one or more masters to update their copies of the replicated set of information. 
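The arrangement recited in the claims above (a shared data structure holding unique, non-modifiable change information entries, a status data structure recording what each master has processed, and cleanup of entries that are no longer needed) can be illustrated with a minimal, non-limiting sketch. The names ChangeEntry, ReplicationLog, write, pending, mark_processed and cleanup are hypothetical and do not appear in the specification. The sketch also assumes an in-memory structure, a per-master sequence number as the source of entry uniqueness, and in-order processing of each writer's entries; none of these choices is required by the claims.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class ChangeEntry:
    """One change information entry; frozen, so it cannot be modified once written."""
    master_id: str            # master that made the changes
    seq: int                  # per-master sequence number (assumed uniqueness scheme)
    changes: Tuple[str, ...]  # opaque descriptions of the changes


class ReplicationLog:
    """Hypothetical shared data structure writable by several masters."""

    def __init__(self, masters: Iterable[str]) -> None:
        self.masters: List[str] = list(masters)
        # Entries are keyed by (writer, seq), so two masters never write the same key.
        self.entries: Dict[Tuple[str, int], ChangeEntry] = {}
        self._next_seq: Dict[str, int] = {m: 1 for m in self.masters}
        # Status data structure: status[reader][writer] is the highest sequence
        # number from `writer` that `reader` has processed.
        self.status: Dict[str, Dict[str, int]] = {
            m: {w: 0 for w in self.masters} for m in self.masters
        }

    def write(self, master_id: str, changes: Iterable[str]) -> ChangeEntry:
        """Append a new entry on behalf of master_id; existing entries are untouched."""
        seq = self._next_seq[master_id]
        self._next_seq[master_id] = seq + 1
        entry = ChangeEntry(master_id, seq, tuple(changes))
        self.entries[(master_id, seq)] = entry
        # The writer has already applied these changes to its own copy.
        self.status[master_id][master_id] = seq
        return entry

    def pending(self, reader_id: str) -> List[ChangeEntry]:
        """Return entries that reader_id has not yet processed."""
        seen = self.status[reader_id]
        todo = [e for e in self.entries.values() if e.seq > seen[e.master_id]]
        return sorted(todo, key=lambda e: (e.master_id, e.seq))

    def mark_processed(self, reader_id: str, entry: ChangeEntry) -> None:
        """Record in the status data structure that reader_id applied the entry."""
        seen = self.status[reader_id]
        seen[entry.master_id] = max(seen[entry.master_id], entry.seq)

    def cleanup(self) -> int:
        """Delete entries that every master has processed; return how many were removed."""
        removed = 0
        for writer, seq in list(self.entries):
            if all(self.status[m][writer] >= seq for m in self.masters):
                del self.entries[(writer, seq)]
                removed += 1
        return removed
```

In this sketch, each master only adds entries under its own key and only updates its own row of the status structure, which is one way to keep masters from conflicting when they update the shared structures. A master would call pending to obtain the entries it still has to apply to its copy, call mark_processed after applying each one, and any master could call cleanup to delete entries that the status structure shows are no longer needed.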